Chinese Named Entity Recognition Method in History and Culture Field Based on BERT

Authors

Abstract

With the rapid development of the Internet, the way people obtain information has changed tremendously. In recent years, the knowledge graph has become a popular tool for the public to acquire knowledge. For Chinese history and culture, most researchers have adopted traditional named entity recognition methods to extract information from unstructured historical text data. However, the traditional method has certain defects: it easily ignores the association between entities. To extract entities from a large amount of historical and cultural information more accurately and efficiently, this paper proposes a model combining Bidirectional Encoder Representations from Transformers with Bidirectional Long Short-Term Memory and a Conditional Random Field (BERT-BiLSTM-CRF). First, the BERT pre-trained language model is used to encode each single character, giving a vector representation corresponding to each character. Then a Bidirectional Long Short-Term Memory (BiLSTM) layer is applied to semantically encode the input text. Finally, the label with the highest probability is output through the Conditional Random Field (CRF) layer to obtain each character’s category. This model uses the BERT pre-trained language model to replace static word vectors trained in the traditional way. In comparison, BERT can dynamically generate semantic vectors according to the context of words, which improves the representation ability of the word vectors. The experimental results show that the proposed model achieves excellent results on the named entity recognition task in the field of history and culture. Compared with existing named entity recognition methods, the precision rate, recall rate, and $F_1$ value are significantly improved.
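The final step described in the abstract, choosing the best label sequence in the CRF layer, is commonly done with Viterbi decoding. The following is a minimal sketch of that technique in plain Python, not the authors' implementation. The tag set, the emission scores (which in a BERT-BiLSTM-CRF model would come from the BiLSTM layer over BERT character vectors), and the transition scores are all hypothetical values chosen for illustration.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring sequence of tag indices.

    emissions[t][j]   : score of tag j for character t (assumed BiLSTM output).
    transitions[i][j] : score of moving from tag i to tag j (CRF parameters).
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    score = list(emissions[0])  # best score of any path ending in each tag
    backpointers = []           # backpointers[t-1][j] = best previous tag for tag j at step t
    for t in range(1, len(emissions)):
        new_score = []
        pointers = []
        for j in range(num_tags):
            best_prev = max(
                range(num_tags), key=lambda i: score[i] + transitions[i][j]
            )
            new_score.append(
                score[best_prev] + transitions[best_prev][j] + emissions[t][j]
            )
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Follow the backpointers from the best final tag to recover the path.
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical tag set and scores, for illustration only.
TAGS = ["O", "B-PER", "I-PER"]
# Strongly penalize the invalid transition O -> I-PER.
transitions = [
    [0.0, 0.0, -1e4],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
emissions = [
    [0.1, 2.0, 0.3],  # character 1
    [0.5, 0.2, 1.5],  # character 2
    [2.0, 0.1, 0.4],  # character 3
]
path = viterbi_decode(emissions, transitions)
print([TAGS[i] for i in path])  # ['B-PER', 'I-PER', 'O']
```

The transition scores are what let a CRF layer account for dependencies between neighboring labels: a per-character argmax could emit `I-PER` directly after `O`, while the decoder above avoids it because that transition carries a large penalty.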


Similar resources

Chinese Named Entity Recognition and Disambiguation Based on Wikipedia

This paper presents a method for named entity recognition and disambiguation based on Wikipedia. First, we establish a Wikipedia database using the open-source JWPL tools. Second, we extract the definition term from the first sentence of each Wikipedia page and use it as external knowledge in named entity recognition. Finally, we achieve named entity disambiguation using Wikipedia disambiguation pag...


A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features

Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three kinds of methods have conventionally been used to extract named entities from a text: rule-based, machine-learning-based, and hybrids of the two. Machine-learning-based methods perform well on the Persian language if they are trained with good features. To get good performanc...


Chinese Named Entity Recognition Based on Hierarchical Hybrid Model

Chinese named entity recognition is a challenging yet important task in natural language processing. This paper presents a novel approach based on a hierarchical hybrid model to recognize Chinese named entities. Three mutually dependent stages (boosting, Markov Logic Networks (MLNs) based recognition, and abbreviation detection) are integrated in the model. AdaBoost algorithm is utili...


Transliterated Named Entity Recognition Based on Chinese Word Sketch

One of the unique challenges in Chinese language processing is cross-strait named entity recognition. Because different transliteration strategies have been adopted, transliterations of foreign names can vary greatly between the PRC and Taiwan. This situation poses a serious problem for NLP tasks, including data mining, translation, and information retrieval. In this paper, we introduce a novel approach to...


Named Entity Recognition in Persian Text using Deep Learning

Named entity recognition is a fundamental task in the field of natural language processing and is also known as a subtask of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...



Journal

Journal title: International Journal of Computational Intelligence Systems

Year: 2021

ISSN: 1875-6883, 1875-6891

DOI: https://doi.org/10.1007/s44196-021-00019-8